Adaptive Filtering of Multilingual Document Streams

Author

  • Douglas W. Oard
Abstract

The increasingly ubiquitous global information structure makes it possible to examine high-volume text streams that contain documents written in a variety of languages. Present monolingual adaptive filtering techniques learn profiles which reflect user preferences and then apply those profiles to reduce the volume of new documents that must be examined by the user to manageable levels. This paper presents three techniques for extending adaptive monolingual text filtering techniques to manage multilingual document streams. Experimental results are given which demonstrate that dictionary-based and corpus-based techniques achieve similar performance in this application. This observation motivates our development of a translation technique designed specifically for vector space text representations which can, in principle, exploit both dictionary-based and corpus-based techniques. Results of initial experiments with this technique are given, and the potential advantages of the new technique are discussed. The paper concludes with a discussion of future directions for adaptive multilingual text filtering.
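The filtering idea summarized in the abstract can be illustrated with a minimal sketch. This is not the paper's method: it assumes a plain bag-of-words vector space, a profile built by summing the vectors of documents the user marked relevant, a cosine threshold for acceptance, and a naive dictionary-based translation that splits each term's weight evenly across its listed translations. All function names, the threshold value, and the toy Spanish-to-English dictionary are hypothetical and chosen only for illustration.

```python
from collections import Counter
import math


def vectorize(text):
    """Bag-of-words term-frequency vector (whitespace tokenization, lowercased)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def learn_profile(relevant_docs):
    """Assumed profile learner: sum the vectors of documents judged relevant."""
    profile = Counter()
    for doc in relevant_docs:
        profile.update(vectorize(doc))
    return profile


def translate_vector(vec, dictionary):
    """Assumed dictionary-based translation of a term vector.

    Each source term's weight is split evenly across its translations;
    terms missing from the dictionary are dropped.
    """
    out = Counter()
    for term, weight in vec.items():
        targets = dictionary.get(term, [])
        for target in targets:
            out[target] += weight / len(targets)
    return out


def accept(doc, profile, dictionary=None, threshold=0.3):
    """Pass a document to the user if it is similar enough to the profile."""
    vec = vectorize(doc)
    if dictionary is not None:
        vec = translate_vector(vec, dictionary)
    return cosine(vec, profile) >= threshold


# Toy data, invented for this example.
profile = learn_profile(["stock market prices", "market trading"])
toy_dict = {
    "mercado": ["market"],
    "precios": ["prices"],
    "acciones": ["stock", "shares"],
}
```

With these toy inputs, `accept("precios mercado acciones", profile, toy_dict)` passes the Spanish document through the English profile, while the same text without a dictionary shares no terms with the profile and is rejected. Translating the document vector rather than the raw text is one possible design; translating the profile into each stream language would be an equally plausible alternative.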

Similar resources

Adaptive Filtering of Multilingual Document Streams

The increasingly ubiquitous global information structure makes it possible to examine high-volume text streams that contain documents written in a variety of languages. Present monolingual adaptive filtering techniques learn profiles which reflect user preferences and then apply those profiles to reduce the volume of new documents that must be examined by the user to manageable levels. This paper p...

Natural language descriptions for video streams

The collection of digital images and videos has grown exponentially in recent years as more and more data become available in the form of personal photo albums, handheld camera videos, feature films and multilingual broadcast news videos, presenting visual data ranging from unstructured to highly structured. Today video data accounts for 80 percent of all network traffic. There is a need for quali...

Text Categorization for Internet Content Filtering

Text Filtering is one of the most challenging and useful tasks in the Multilingual Information Access field. In a number of filtering applications, Automated Text Categorization of documents plays a key role. In this paper, we present two such applications (Hermes and POESIA), focused on personalized news delivery and the blocking of inappropriate Internet content, respectively. We are specifically...

Cascading XSL Filters for Content Selection in Multilingual Document Generation

Content selection is a key factor of any successful document generation system. This paper shows how a content selection algorithm has been implemented using an efficient combination of XML/XSL technology and the framework of RST for discourse modeling. The system generates multilingual documents adapted to user profiles in a learning environment for the web. This CourseViewGenerator applies si...

Adaptive Information Filtering concepts and algorithms

Adaptive information filtering is concerned with filtering information streams in dynamic (changing) environments. The changes may occur both on the transmission side — the nature of the streams can change — and on the reception side — the interests of the user (or group of users) can change. While information filtering and information retrieval have a lot in common, this dissertation’s primary...

Publication date: 1997